Unsupervised learning of morphological families: comparison of methods and multilingual aspects

Author

  • Delphine Bernhard
Abstract

ABSTRACT. This article describes MorphoClust and MorphoNet, two methods for the unsupervised learning of morphological families. MorphoClust builds families through successive groupings, in a manner similar to agglomerative hierarchical clustering methods. MorphoNet, by contrast, is based on community detection in lexical networks. The nodes of these networks represent words, and the edges represent morphological transformation rules acquired automatically from orthographically similar words. We apply both methods to a bilingual English-German lexicon, separately and in combination, and evaluate the results against the CELEX lexical database.
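The lexical-network idea in the abstract can be illustrated with a small sketch. This is not the MorphoNet implementation: the abstract does not specify how rules are extracted or which community detection algorithm is used. The sketch assumes that rules are plain suffix substitutions between words sharing a prefix of at least three letters, that a rule is kept only if at least two word pairs attest it, and that connected components stand in for community detection. The names suffix_rule, build_network and families, and the parameters min_stem and min_rule_count, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def suffix_rule(a, b, min_stem=3):
    """Return a (suffix_a, suffix_b) substitution rule for two words,
    or None if their shared prefix is shorter than min_stem.
    Assumption: rules are suffix substitutions only."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    if n < min_stem:
        return None
    return (a[n:], b[n:])


def build_network(words, min_rule_count=2):
    """Build an undirected word graph. Two words are linked when the rule
    relating them is attested by at least min_rule_count word pairs."""
    pairs_by_rule = defaultdict(list)
    # Sorting makes the direction of each rule consistent across pairs.
    for a, b in combinations(sorted(set(words)), 2):
        rule = suffix_rule(a, b)
        if rule is not None:
            pairs_by_rule[rule].append((a, b))
    graph = {w: set() for w in words}
    for pairs in pairs_by_rule.values():
        if len(pairs) >= min_rule_count:
            for a, b in pairs:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def families(graph):
    """Group words into families using connected components,
    a simple stand-in for a real community detection algorithm."""
    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


words = ["walk", "walks", "walked", "talk", "talks", "talked",
         "jump", "jumps", "table", "tablet"]
for family in families(build_network(words)):
    print(family)
```

In this toy word list, the pair "table"/"tablet" yields a rule that only one pair attests, so the two words are not linked and each forms its own family. A real community detection algorithm would additionally be able to split densely connected subgroups inside a single component, which connected components cannot do.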

Similar resources

Comparison school bonding and interpersonal problems in students with unsupervised and abused families with normal

This study aimed to compare the school bonding and interpersonal problems in students with unsupervised and abused families with normal families in Bandar Lengeh. The sample consisted of 152 normal students and 81 unsupervised or abused students. Normal students were selected by the multi-stage cluster sampling method. Data were collected through two questionnaires: school bonding (Rezaei Shari...

Fast and unsupervised methods for multilingual cognate clustering

In this paper we explore the use of unsupervised methods for detecting cognates in multilingual word lists. We use online EM to train sound segment similarity weights for computing similarity between two words. We tested our online systems on geographically spread sixteen different language groups of the world and show that the Online PMI system (Pointwise Mutual Information) outperforms a HMM ...

Unsupervised Learning of Morphological Forests

This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edgewise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of t...

Construction of supervised and unsupervised learning systems for multilingual text categorization

Due to the availability of a huge amount of textual data from a variety of sources, users of internationally distributed information regions need effective methods and tools that enable them to discover, retrieve and categorize relevant information, in whatever language and form it may have been stored. This drives a convergence of numerous interests from diverse research communities focusing o...

Unsupervised Multilingual Learning for Morphological Segmentation

For centuries, the deep connection between languages has brought about major discoveries about human communication. In this paper we investigate how this powerful source of information can be exploited for unsupervised language learning. In particular, we study the task of morphological segmentation of multiple languages. We present a nonparametric Bayesian model that jointly induces morpheme s...

Journal:
  • TAL

Volume 51

Publication date: 2010